Abstract

Background: The newly emerged Middle East respiratory syndrome coronavirus (MERS-CoV) that first appeared in
Saudi Arabia during the summer of 2012 has to date (20th September 2013) caused 58 human deaths. MERS-CoV
utilizes the dipeptidyl peptidase 4 (DPP4) host cell receptor, and analysis of the long-term interaction between virus
and receptor provides key information on the evolutionary events that led to the viral emergence.

Findings: We show that bat DPP4 genes have been subject to significant adaptive evolution, suggestive of a
long-term arms race between bats and MERS-related CoVs. In particular, we identify three positively selected
residues in DPP4 that directly interact with the viral surface glycoprotein. 

Conclusions: Our study suggests that the evolutionary lineage leading to MERS-CoV may have circulated in 
bats for a substantial time period. 
Main text

Middle East respiratory syndrome coronavirus (MERS-CoV) [1],
first described by the World Health Organization (WHO) on
23rd September 2012 [2,3], has to date (20th September 2013)
caused 130 laboratory-confirmed human infections with 58 deaths
(http://www.who.int/csr/don/2013_09_20/en/index.html).
MERS-CoV belongs to lineage C of the genus Betacoronavirus in
the family Coronaviridae, and is closely related to Tylonycteris
bat coronavirus HKU4 (BtCoV-HKU4), Pipistrellus bat coronavirus
HKU5 (Bt-HKU5) [4,5] and CoVs in Nycteris bats [6], suggestive
of a bat origin [6]. Unlike severe acute respiratory syndrome
(SARS) CoV, which uses the angiotensin-converting enzyme 2
(ACE2) receptor for cell entry [7], MERS-CoV employs the
dipeptidyl peptidase 4 receptor (DPP4; also known as CD26), and
recent work has demonstrated that expression of both human and
bat DPP4 in non-susceptible cells enabled viral entry [8].
Cell-surface receptors such as DPP4 play a key role in
facilitating viral invasion and tropism. As a consequence,
the long-term co-evolutionary dynamics between hosts
and viruses often leave evolutionary footprints in both
the receptor-encoding genes of hosts and the receptor-binding
domains (RBDs) of viruses in the form of positively selected
amino acid residues (i.e. adaptive evolution). For example,
signatures of recurrent positive selection have been observed
in ACE2 genes in bats [9], supporting the past circulation of
SARS-related CoVs in bats. To better understand the origins of
MERS-CoV, as well as its potentially long-term evolutionary
dynamics with bat hosts (as opposed to a short-term association,
which lacks virus-host interaction) [5,10], we studied the
molecular evolution of DPP4 across the mammalian phylogeny.

We first analyzed the selection pressures acting on bat
DPP4 genes using the ratio of nonsynonymous (dN) to
synonymous (dS) nucleotide substitutions per site (ratio
dN/dS); dN > dS is indicative of adaptive evolution. The
complete DPP4 mRNA sequence of the common pipistrelle
Table 1 Sequences used in the evolutionary analysis of DPP4

Common name             Species name                 Family            Accession no.
Sheep                   Ovis aries                   Bovidae           XM_004004660
Killer whale            Orcinus orca                 Delphinidae       XM_004283621
Cow                     Bos taurus                   Bovidae           NM_174039
Pig                     Sus scrofa                   Suidae            NM_214257
Pacific walrus          Odobenus rosmarus divergens  Odobenidae        XM_004410199
Ferret                  Mustela putorius furo        Mustelidae        DQ266376
Cat                     Felis catus                  Felidae           NM_001009838
Horse                   Equus caballus               Equidae           XM_001493999
Rhinoceros              Ceratotherium simum          Rhinocerotidae    XM_004428264
Large flying fox        Pteropus vampyrus            Pteropodidae      ENSPVAG00000002634
Black flying fox        Pteropus alecto              Pteropodidae      KB031068
Common vampire bat      Desmodus rotundus            Phyllostomidae    GABZ01004546
Brandt's bat            Myotis brandtii              Vespertilionidae  KE161360
David's myotis          Myotis davidii               Vespertilionidae  KB109552
Little brown bat        Myotis lucifugus             Vespertilionidae  GL429772
Common pipistrelle      Pipistrellus pipistrellus    Vespertilionidae  KC249974
Guinea pig              Cavia porcellus              Caviidae          XM_003478564
Degu                    Octodon degus                Octodontidae      XM_004629976
Lesser Egyptian jerboa  Jaculus jaculus              Dipodidae         XM_004651712
Mouse                   Mus musculus                 Muridae           BC022183
Rat                     Rattus norvegicus            Muridae           NM_012789
Human                   Homo sapiens                 Hominidae         NM_001935
Chimpanzee              Pan troglodytes              Hominidae         GABE01002695
Pygmy chimpanzee        Pan paniscus                 Hominidae         XM_003820939
Gorilla                 Gorilla gorilla gorilla      Hominidae         XM_004032706
Orangutan               Pongo abelii                 Hominidae         NM_001132869
Gibbon                  Nomascus leucogenys          Hylobatidae       XM_003266171
Olive baboon            Papio anubis                 Cercopithecidae   XM_003907539
Rhesus monkey           Macaca mulatta               Cercopithecidae   JU474559
Galago                  Otolemur garnettii           Galagidae         XM_003795172
Marmoset                Callithrix jacchus           Cebidae           XM_002749392
American pika           Ochotona princeps            Ochotonidae       XM_004577330

(Pipistrellus pipistrellus) was downloaded from GenBank
(www.ncbi.nlm.nih.gov/genbank/) along with that of the
common vampire bat (Desmodus rotundus) from a
transcriptome database
(http://www.ncbi.nlm.nih.gov/bioproject/178123). These
sequences were then used to mine and extract DPP4 mRNA
transcripts from a further five bat genomes (Table 1) using
tBLASTn and GeneWise [11]. The complete DPP4 genes of bats and
non-bat reference genomes from a range of mammalian
species (Table 1) were aligned using MUSCLE [12]
guided by translated amino acid sequences (n = 32; 727
amino acids). We then compared a series of models within
a maximum likelihood framework [13], incorporating the
published mammalian species tree [14-16]. This analysis
(the Free Ratio model) revealed that the dN/dS value on
the bat lineage (0.96) was four times greater than the
mammalian average (Figure 1). The higher dN/dS ratios on
lineages leading to bats (Table 2) during mammalian evolution
accord with the growing body of data [5,6,17,18] indicating
that the newly emerged MERS-CoV ultimately has a bat origin.
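
To make the dN/dS idea concrete, the following is a minimal sketch of how differences between two aligned coding sequences can be classified as synonymous or nonsynonymous. It is an illustration only and is not the maximum likelihood codon model used in this study [13]: it counts differing codons rather than estimating substitutions per site, and it does not correct for multiple substitutions. The function name and the toy sequences are hypothetical and are not DPP4 data.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Map each codon to its one-letter amino acid ('*' marks a stop codon).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both inputs must be aligned, in-frame coding sequences of equal
    length. Codons with gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, equal length")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue  # identical codons carry no information here
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical toy sequences: CTG->CTA keeps Leu (synonymous),
# AAA->AGA changes Lys to Arg (nonsynonymous), GGC is unchanged.
syn, nonsyn = classify_codon_differences("CTGAAAGGC", "CTAAGAGGC")
print(syn, nonsyn)
```

Proper dN and dS estimates additionally divide such counts by the numbers of nonsynonymous and synonymous sites, which is why the codon-model approach cited above is preferred for real data.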

We next analysed the selection pressures at individual 
amino acid sites in bat DPP4. Using the Bayesian FUBAR 
Figure 1 Selection pressures on DPP4 during mammalian
evolution. Ratios of nonsynonymous (dN) to synonymous (dS)
nucleotide substitutions per site (dN/dS) are shown on four major
ancestral branches; dN and dS numbers are also given in parentheses.
Values for individual lineages are given in Table 2. DPP4 sequences of
bat origin are shaded.


method [19] in the HyPhy package [20], we identified six
codons that were assigned dN/dS > 1 with high posterior
probability (a strict cut-off of 95% in this analysis)
(Table 3). To identify those sites under positive selection
that may interact directly with MERS-CoV-like spike
protein, bat DPP4 (from the common pipistrelle) was
modelled against the structure of the human DPP4/
MERS-CoV spike complex [21] (Figure 2A). This revealed
that three of the six positively selected residues (positions
187, 288 and 392) were located at the interface between
bat DPP4 and the MERS-CoV RBD (receptor-binding domain)
(Figure 2). These residues therefore provide direct
evidence of a long-term co-evolutionary history between
viruses and their hosts. We also observed several variable
regions (Figure 2B) within the bat RBD that may also have
resulted from virally induced selection pressure and which
merit additional investigation in a larger data set.

Our analysis therefore suggests that the evolutionary
lineage leading to current MERS-CoV co-evolved with
bat hosts for an extended time period, eventually
jumping species boundaries to infect humans, perhaps
through an intermediate host. As such, the emergence of


Table 2 Numbers of nonsynonymous (dN) and synonymous
(dS) substitutions per site in DPP4 genes in different mammals

Common name             dN      dS      dN/dS
Sheep                   0.004   0.013   0.280
Killer whale            0.023   0.039   0.595
Cow                     0.003   0.016   0.157
Pig                     0.027   0.109   0.246
Pacific walrus          0.014   0.053   0.260
Ferret                  0.015   0.064   0.235
Cat                     0.021   0.081   0.258
Horse                   0.016   0.055   0.290
Rhinoceros              0.017   0.044   0.385
Large flying fox        0.005   0.001   3.561
Black flying fox        0.004   0.008   0.487
Common vampire bat      0.042   0.125   0.500
Brandt's bat            0.006   0.012   0.463
David's myotis          0.010   0.028   0.380
Little brown bat        0.007   0.007   0.943
Common pipistrelle      0.031   0.066   0.470
Guinea pig              0.018   0.078   0.238
Degu                    0.016   0.128   0.122
Lesser Egyptian jerboa  0.023   0.179   0.131
Mouse                   0.019   0.093   0.206
Rat                     0.027   0.110   0.248
Human                   0.001   0.007   0.086
Chimpanzee              0.000   0.002   0.000
Pygmy chimpanzee        0.001   0.000   ND
Gorilla                 0.003   0.004   0.863
Orangutan               0.002   0.000   ND
Gibbon                  0.003   0.009   0.344
Olive baboon            0.000   0.005   0.000
Rhesus monkey           0.000   0.004   0.000
Galago                  0.022   0.149   0.149
Marmoset                0.009   0.053   0.160
American pika           0.036   0.229   0.156

ND: Not determined because no synonymous substitutions are present.
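
As a reading aid for Table 2, the sketch below groups the tabulated terminal-branch dN/dS values into bat and non-bat species and takes the median of each group. This is a descriptive summary for illustration only and is not an analysis performed in this study; the lineage-level estimates quoted in the text come from the Free Ratio model and are not recomputed here. The grouping, the variable names and the choice of the median are assumptions made for this example.

```python
from statistics import median

# Terminal-branch dN/dS values transcribed from Table 2.
# None stands for "ND" (no synonymous substitutions, ratio undefined).
BAT = {
    "Large flying fox": 3.561, "Black flying fox": 0.487,
    "Common vampire bat": 0.500, "Brandt's bat": 0.463,
    "David's myotis": 0.380, "Little brown bat": 0.943,
    "Common pipistrelle": 0.470,
}
NON_BAT = {
    "Sheep": 0.280, "Killer whale": 0.595, "Cow": 0.157, "Pig": 0.246,
    "Pacific walrus": 0.260, "Ferret": 0.235, "Cat": 0.258,
    "Horse": 0.290, "Rhinoceros": 0.385, "Guinea pig": 0.238,
    "Degu": 0.122, "Lesser Egyptian jerboa": 0.131, "Mouse": 0.206,
    "Rat": 0.248, "Human": 0.086, "Chimpanzee": 0.000,
    "Pygmy chimpanzee": None, "Gorilla": 0.863, "Orangutan": None,
    "Gibbon": 0.344, "Olive baboon": 0.000, "Rhesus monkey": 0.000,
    "Galago": 0.149, "Marmoset": 0.160, "American pika": 0.156,
}


def median_ratio(ratios):
    """Median dN/dS over the species with a defined (non-ND) ratio."""
    defined = [value for value in ratios.values() if value is not None]
    return median(defined)


print("bat median dN/dS:", median_ratio(BAT))
print("non-bat median dN/dS:", median_ratio(NON_BAT))
```

The median is used rather than the mean because a single branch with very few synonymous changes (for example the large flying fox) can produce an extreme ratio that would dominate an average.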


Table 3 Putatively positively selected DPP4 codons in bats

Codon position (a)   Posterior probability (b)   dN/dS
46                   0.97                        14.95
57                   0.97                        13.13
112                  0.94                        10.27
187                  0.95                        8.55
288                  0.98                        13.90
392                  0.97                        14.63

(a) Codon position corresponding to the human DPP4 (NP_001926) protein sequence.
(b) Posterior probability of residues assigned a dN/dS ratio greater than 1.










































Cui et al. Virology Journal 2013, 10:304
http://www.virologyj.com/content/10/1/304




[Figure 2 image not reproduced. Panel A: homology model. Panel B: protein
alignment of DPP4 codons 41-400 with rows, in order, for Human, Common
pipistrelle, David's myotis, Brandt's bat, Common vampire bat, Black flying
fox, Large flying fox and Little brown bat.]

Figure 2 Interaction of bat DPP4 and MERS-CoV spike protein receptor-binding domain and the location of positively selected sites.

The structure was displayed using PyMOL v1.6 (http://www.pymol.org/). (A) Homology model showing the structural interactions between bat
DPP4 (from common pipistrelle) coloured grey and MERS-CoV spike protein receptor-binding domain coloured blue. The three positively selected
residues (positions 187, 288 and 392) located within the interface where virus and host interact are highlighted in red. (B) Protein alignment of
human DPP4 compared to that of seven bat species showing RBD spanning codons 41-400. Conserved and variable positions are shown in
black and grey text, respectively, and residues under positive selection are coloured red.


MERS-CoV may parallel that of the related SARS-CoV
[22]. Although one bat species, Taphozous perforatus, in
Saudi Arabia has been found to harbour a small RdRp
(RNA-dependent RNA polymerase) fragment of MERS-CoV [17],
a larger viral sampling of bats and other animals
with close exposure to humans, including dromedary
camels where serological evidence for MERS-CoV has been
identified [23], is clearly needed to better understand the
viral transmission route. Alternatively, it is possible that
the adaptive evolution present in bat DPP4 was due
to viruses other than MERS-CoVs, a possibility that will need
to be better assessed when a larger number of viruses are
available for analysis. Overall, our study provides evidence
that a long-term evolutionary arms race likely occurred
between MERS-related CoVs and bats.